Abstract. This paper presents a new approach for optimizing multithreaded
programs with pointer constructs. The approach has applications
in the area of certified code (proof-carrying code) where a justification or a
proof for the correctness of each optimization is required. The optimization
considered here is dead code elimination.

Towards optimizing multithreaded programs, the paper presents a new
operational semantics for parallel constructs like join-fork constructs,
parallel loops, and conditionally spawned threads. The paper also presents
a novel type system for flow-sensitive pointer analysis of multithreaded
programs. This type system is extended to obtain a new type system
for live-variables analysis of multithreaded programs. The live-variables
type system is extended to build the third novel type system, proposed
in this paper, which carries the optimization of dead code elimination.
The justification mentioned above takes the form of a type derivation in our
approach.



1 Introduction 



One of the mainstream programming approaches today is multithreading.
Using multiple threads is useful in many ways, such as (a) concealing suspension
caused by some commands, (b) making it easier to build huge software
systems, (c) improving execution of programs, especially those that are executed
on multiprocessors, and (d) building advanced user interfaces.

The potential interaction between threads in a multithreaded program
complicates both the compilation and the program analysis processes. Moreover,
this interaction also makes it difficult to extend the scope of program analysis
techniques of sequential programs to cover multithreaded programs.

Typically optimizing multithreaded programs is achieved in an algorithmic 
form using data-flow analyses. This includes transforming the given program 
into a control-flow graph which is a convenient form for the algorithm to ma- 
nipulate. For some applications like certified code, it is desirable to associate 
each program optimization with a justification or a proof for the correctness of 
the optimization. For these cases, the algorithmic approach to program analy- 
sis is not a good choice as it does not work on the syntactical structure of the 
program and hence does not reflect the transformation process. Moreover the 
desired justification must be relatively simple as it gets checked within a trusted 
computing base. 

Type systems stand as a convenient alternative for the algorithmic approach 
of program analyses when a justification is necessary. In the type systems ap- 
proach, analysis and optimization of programs are directed by the syntactical 
structure of the program. Inference rules of type systems are
relatively simple, and so is the justification, which takes the form of a type
derivation in this case. The adequacy of type systems for program analysis has
already been studied, for example in [1,2,3].

Pointer analysis is among the most important program analyses and it cal- 
culates information describing contents of pointers at different program points. 
The application of pointer analysis to multithreaded programs results in in- 
formation that is required for program analyses and compiler optimizations 
such as live-variables analysis and dead code elimination, respectively. The
live-variables analysis finds for each program point the set of variables whose
values are used usefully in the rest of the program. The results of live-variables
analysis are necessary for the optimization of dead code elimination, which
removes code that has no effect on values of variables of interest at the end of the
program.

This paper presents a new approach for optimizing multithreaded programs 
with pointer constructs. The scope of the proposed approach is broad enough to 
include certified (proof-carrying) code applications where a justification for op- 
timization is necessary. Type systems are basic tools of the new approach which 
considers structured parallel constructs such as join-fork constructs, parallel 
loops, and conditionally spawned threads. The justifications in our approach 
take the form of type derivations. More precisely, the paper presents a type 
system for flow-sensitive pointer analysis of multithreaded programs. The live- 
variables analysis of multithreaded programs is also treated in this paper by a 
type system which is an extension of the type system for pointer analysis. The 
extension has the form of another component being added to points-to types. 
The dead code elimination of multithreaded programs is then achieved using
a type system which is again an extension of the type system for live-variables
analysis. This time the extension takes the form of a transformation component
added to inference rules of the type system for live-variables analysis. To prove
the soundness of the three proposed type systems, a novel operational semantics
for parallel constructs is proposed in this paper.



1. x := &y;
2. *x := 2;
3. par{
4.   {*x := 4} ||
6.    *x := 6}
7. };
8. x := 8;
9. x := 9

Fig. 1. A motivating example



Some comments are in order. The language that we use does not allow
pointer arithmetic, so it can be thought of as a Pascal-style language. The parallel
construct assumes that each thread is executed atomically. This is a restriction
that is not common to many implementations of threads. But based on our
construct, it should be easy to treat a more general construct. Writing computer
programs to efficiently calculate the types, using our type systems, is not hard.
This is so because each time the program calculates a type, the program will be
given a pre-type and seek its post-type, or vice versa. In the semantics of our
language, we assume that $\mathbb{Z}$ and the set of addresses, Addrs, are disjoint sets.

Motivation 

Figure 1 presents a motivating example of the work presented in this paper.
Consider the program on the left-hand-side of the figure. Suppose that at the
end of the program we are interested in the values of x and y. We note that the
assignment in line 8 is dead code as the variable x is modified in line 9 before we
make any use of the value that the variable gets in line 8. The assignment in line 
2 indirectly modifies y which is modified again in the par command before any 
useful use of the value that y gets in line 2. Therefore line 2 is dead code. The 
par command has two threads which can be executed in any order. If the first 
thread is executed first then assignments in lines 4 and 5 become dead code. 
If the second thread is executed first then assignments in lines 5 and 6 become 
dead code. Therefore the dead code in the par command is the assignment in 
line 5 only. 

This paper presents a technique that discovers and removes such dead code
in parallel structured programs with pointer constructs. The output of the
technique is a program like that on the right-hand-side of Figure 1. In addition to
the join-fork construct (par), the paper also considers other parallel constructs
like conditionally spawned threads and parallel loops. With each such program
optimization, our technique presents a justification or a proof for the correctness
of the optimization. The proof takes the form of a type derivation.

Contributions 

Contributions of this paper are the following: 

1. A simple yet powerful operational semantics for multithreaded programs
with pointer constructs.

2. A novel type system for pointer analysis of multithreaded programs. To our
knowledge, this is the first attempt to use type systems for pointer analysis
of multithreaded programs.

3. A new type system for live-variables analysis of multithreaded programs.

4. An original type system for the optimization of dead code elimination for
multithreaded programs.

Organization. The rest of the paper is organized as follows. The language that
we study (a while language enriched with pointer and parallel constructs) and
an operational semantics for its constructs are presented in Section 2. Sections 3
and 4 present our type systems for flow-sensitive pointer and live-variables
analyses, respectively. The type system carrying program optimization is
introduced in Section 5. Related work is discussed in Section 6.

2 Programming language 

This section presents the programming language (Figure 2) we use together
with an operational semantics for its constructs. The language is the simple
while language [4] enriched with commands for pointer manipulations and
structured parallel constructs.



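\[
\begin{array}{rcl}
\multicolumn{3}{l}{n \in \mathbb{Z},\ x \in Var,\ \text{and}\ \oplus \in \{+, -, \times\}} \\
e \in Aexprs & ::= & x \mid n \mid e_1 \oplus e_2 \\
b \in Bexprs & ::= & \textit{true} \mid \textit{false} \mid \neg b \mid e_1 = e_2 \mid e_1 \le e_2 \mid b_1 \wedge b_2 \mid b_1 \vee b_2 \\
S \in Stmts & ::= & x := e \mid x := \&y \mid *x := e \mid x := *y \mid \textit{skip} \mid S_1; S_2 \mid \textit{if } b \textit{ then } S_t \textit{ else } S_f \mid {} \\
 & & \textit{while } b \textit{ do } S_t \mid \textit{par}\{\{S_1\}, \ldots, \{S_n\}\} \mid \textit{par-if}\{(b_1, S_1), \ldots, (b_n, S_n)\} \mid \textit{par-for}\{S\}.
\end{array}
\]

Fig. 2. The programming language.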



The parallel constructs include join-fork constructs, parallel loops, and
conditionally spawned threads. The par (join-fork) construct starts executing many
concurrent threads at the beginning of the par construct and then waits until the
completion of all these executions at the end of the par construct. Semantically,
the par construct can be expressed approximately as if the threads are executed
sequentially in an arbitrary order. The parallel loop construct included in our
language is that of par-for. This construct executes, in parallel, a statically
unknown number of threads each of which has the same code (the loop body).
Therefore the semantics of par-for can be expressed using that of the par
construct. The construct including conditionally spawned threads is that of par-if.
This construct executes, in parallel, its n concurrent threads. The execution of
thread $(b_i, S_i)$ includes the execution of $S_i$ only if $b_i$ is true.

One way to define the meaning of the constructs of our programming
language, including the parallel constructs, is by an operational semantics. This
amounts to defining a transition relation $\leadsto$ between states which are defined
as follows.

Definition 1. 1. $Addrs = \{x' \mid x \in Var\}$ and $Val = \mathbb{Z} \cup Addrs$.
2. $state \in States = \{\textit{abort}\} \cup \{\gamma \mid \gamma \in \Gamma = Var \to Val\}$.

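The semantics of arithmetic and Boolean expressions are defined as usual
except that arithmetic and Boolean operations are not allowed on pointers.

\[
[\![n]\!]\gamma = n \qquad [\![\&x]\!]\gamma = x' \qquad [\![x]\!]\gamma = \gamma(x) \qquad [\![\textit{true}]\!]\gamma = \textit{true} \qquad [\![\textit{false}]\!]\gamma = \textit{false}
\]
\[
[\![e_1 \oplus e_2]\!]\gamma = \begin{cases} [\![e_1]\!]\gamma \oplus [\![e_2]\!]\gamma & \text{if } [\![e_1]\!]\gamma, [\![e_2]\!]\gamma \in \mathbb{Z}, \\ {!} & \text{otherwise.} \end{cases}
\]
\[
[\![e_1 = e_2]\!]\gamma = \begin{cases} {!} & \text{if } [\![e_1]\!]\gamma = {!} \text{ or } [\![e_2]\!]\gamma = {!}, \\ \textit{true} & \text{if } [\![e_1]\!]\gamma = [\![e_2]\!]\gamma, \\ \textit{false} & \text{otherwise.} \end{cases}
\]
\[
[\![e_1 \le e_2]\!]\gamma = \begin{cases} {!} & \text{if } [\![e_1]\!]\gamma \notin \mathbb{Z} \text{ or } [\![e_2]\!]\gamma \notin \mathbb{Z}, \\ [\![e_1]\!]\gamma \le [\![e_2]\!]\gamma & \text{otherwise.} \end{cases}
\]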

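The inference rules of our semantics (transition relation) are defined as follows:

\[
\frac{[\![e]\!]\gamma = {!}}{x := e : \gamma \leadsto \textit{abort}}
\qquad
\frac{[\![e]\!]\gamma \ne {!}}{x := e : \gamma \leadsto \gamma[x \mapsto [\![e]\!]\gamma]}
\qquad
\frac{\gamma(x) = z' \quad z := e : \gamma \leadsto state}{*x := e : \gamma \leadsto state}
\]
\[
\frac{\gamma(x) \notin Addrs}{*x := e : \gamma \leadsto \textit{abort}}
\qquad
\frac{}{x := \&y : \gamma \leadsto \gamma[x \mapsto y']}
\qquad
\frac{\gamma(y) = z' \quad x := z : \gamma \leadsto \gamma'}{x := *y : \gamma \leadsto \gamma'}
\]
\[
\frac{\gamma(y) \notin Addrs}{x := *y : \gamma \leadsto \textit{abort}}
\qquad
\frac{}{\textit{skip} : \gamma \leadsto \gamma}
\qquad
\frac{S_1 : \gamma \leadsto \textit{abort}}{S_1; S_2 : \gamma \leadsto \textit{abort}}
\qquad
\frac{S_1 : \gamma \leadsto \gamma'' \quad S_2 : \gamma'' \leadsto state}{S_1; S_2 : \gamma \leadsto state}
\]
\[
\frac{[\![b]\!]\gamma = {!}}{\textit{if } b \textit{ then } S_t \textit{ else } S_f : \gamma \leadsto \textit{abort}}
\qquad
\frac{[\![b]\!]\gamma = \textit{true} \quad S_t : \gamma \leadsto state}{\textit{if } b \textit{ then } S_t \textit{ else } S_f : \gamma \leadsto state}
\]
\[
\frac{[\![b]\!]\gamma = \textit{false} \quad S_f : \gamma \leadsto state}{\textit{if } b \textit{ then } S_t \textit{ else } S_f : \gamma \leadsto state}
\qquad
\frac{[\![b]\!]\gamma = {!}}{\textit{while } b \textit{ do } S_t : \gamma \leadsto \textit{abort}}
\qquad
\frac{[\![b]\!]\gamma = \textit{false}}{\textit{while } b \textit{ do } S_t : \gamma \leadsto \gamma}
\]
\[
\frac{[\![b]\!]\gamma = \textit{true} \quad S_t : \gamma \leadsto \gamma'' \quad \textit{while } b \textit{ do } S_t : \gamma'' \leadsto state}{\textit{while } b \textit{ do } S_t : \gamma \leadsto state}
\qquad
\frac{[\![b]\!]\gamma = \textit{true} \quad S_t : \gamma \leadsto \textit{abort}}{\textit{while } b \textit{ do } S_t : \gamma \leadsto \textit{abort}}
\]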



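• Join-fork:

\[
\frac{\dagger}{\textit{par}\{\{S_1\}, \ldots, \{S_n\}\} : \gamma \leadsto \gamma'}
\qquad
\frac{\ddagger}{\textit{par}\{\{S_1\}, \ldots, \{S_n\}\} : \gamma \leadsto \textit{abort}}
\]

† there exist a permutation $\theta : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and $n + 1$ states
$\gamma = \gamma_1, \ldots, \gamma_{n+1} = \gamma'$ such that for every $1 \le i \le n$, $S_{\theta(i)} : \gamma_i \leadsto \gamma_{i+1}$.

‡ there exist $m$ such that $1 \le m \le n$, a one-to-one map $\beta : \{1, \ldots, m\} \to
\{1, \ldots, n\}$, and $m + 1$ states $\gamma = \gamma_1, \ldots, \gamma_{m+1} = \textit{abort}$ such that for every
$1 \le i \le m$, $S_{\beta(i)} : \gamma_i \leadsto \gamma_{i+1}$.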

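• Conditionally spawned threads:

\[
\frac{\textit{par}\{\{\textit{if } b_1 \textit{ then } S_1 \textit{ else skip}\}, \ldots, \{\textit{if } b_n \textit{ then } S_n \textit{ else skip}\}\} : \gamma \leadsto \gamma'}{\textit{par-if}\{(b_1, S_1), \ldots, (b_n, S_n)\} : \gamma \leadsto \gamma'}
\]
\[
\frac{\textit{par}\{\{\textit{if } b_1 \textit{ then } S_1 \textit{ else skip}\}, \ldots, \{\textit{if } b_n \textit{ then } S_n \textit{ else skip}\}\} : \gamma \leadsto \textit{abort}}{\textit{par-if}\{(b_1, S_1), \ldots, (b_n, S_n)\} : \gamma \leadsto \textit{abort}}
\]

• Parallel loops:

\[
\frac{\exists n.\ \textit{par}\{\underbrace{\{S\}, \ldots, \{S\}}_{n\text{-times}}\} : \gamma \leadsto \gamma'}{\textit{par-for}\{S\} : \gamma \leadsto \gamma'}
\qquad
\frac{\exists n.\ \textit{par}\{\underbrace{\{S\}, \ldots, \{S\}}_{n\text{-times}}\} : \gamma \leadsto \textit{abort}}{\textit{par-for}\{S\} : \gamma \leadsto \textit{abort}}
\]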

A simple example for the par-for command is $\textit{par-for}\{x := 10\}$. The execution
of this command amounts to fixing a number randomly, say 7, and then
concurrently executing the seven threads $\{x := 10\}_1, \ldots, \{x := 10\}_7$.
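
The join-fork rules can be illustrated by a small interpreter sketch. The Python below is only an illustration under stated assumptions and is not part of the paper: states are dictionaries, addresses are modelled as ("addr", name) pairs, and the names run_par, assign, and store are hypothetical. Since threads are assumed to execute atomically, every execution of a par command is a sequential run of the threads in some order, and a run that reaches abort stops there.

```python
from itertools import permutations

ABORT = "abort"  # stands for the distinguished abort state


def run_par(threads, state):
    """Return every outcome of par{{S1},...,{Sn}} started in `state`.

    Each thread is a function mapping a state (dict) to a new state or ABORT.
    All orders are tried; a run stops as soon as one thread aborts, which
    corresponds to executing only a prefix of the chosen order.
    """
    outcomes = []
    for order in permutations(range(len(threads))):
        current = dict(state)
        for i in order:
            current = threads[i](current)
            if current == ABORT:
                break
        if current not in outcomes:
            outcomes.append(current)
    return outcomes


def assign(var, value):
    """Thread body for `var := value`, with `value` an integer constant."""
    return lambda s: {**s, var: value}


def store(ptr, value):
    """Thread body for `*ptr := value`; aborts if ptr holds no address."""
    def step(s):
        target = s.get(ptr)
        if not (isinstance(target, tuple) and target[0] == "addr"):
            return ABORT
        return {**s, target[1]: value}
    return step
```

For instance, with x holding the address of y, running run_par on the two threads store("x", 4) and store("x", 6) yields one final state per execution order, while pairing assign("x", 1) with store("x", 5) also produces the abort outcome for the order in which x is overwritten first.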



3 Pointer analysis 

In this section, we present a novel technique for flow-sensitive pointer analysis
of structured parallel programs where shared pointers may be updated
simultaneously. Our technique manipulates important parallel constructs: join-fork
constructs, parallel loops, and conditionally spawned threads. The proposed
technique has the form of a compositional type system which is simply
structured. Consequently results of the analysis are in the form of types assigned to
expressions and statements approved by type derivations. Therefore a type is
assigned to each program point of a statement (program). This assigned type
specifies for each variable in the program a conservative approximation of the
addresses that may be assigned to the variable. The set of points-to types PTS
and the relation $\models\ \subseteq \Gamma \times PTS$ are defined as follows:

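Definition 2. 1. $PTS = \{pts \mid pts : Var \to 2^{Addrs}\}$.

2. $pts \le pts' \stackrel{\text{def}}{\iff} \forall x \in Var.\ pts(x) \subseteq pts'(x)$.

3. $\gamma \models pts \stackrel{\text{def}}{\iff} (\forall x \in Var.\ \gamma(x) \in Addrs \Rightarrow \gamma(x) \in pts(x))$.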
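
The ordering on points-to types and the satisfaction relation of the definition above can be sketched in a few lines. The Python below is an illustrative encoding, not the paper's: a points-to type is a dictionary from variable names to frozensets of pointed-to variable names, an address is an ("addr", name) pair, and the function names leq and models are hypothetical.

```python
def leq(pts1, pts2, variables):
    """pts1 <= pts2: pointwise inclusion of points-to sets over `variables`."""
    return all(pts1.get(x, frozenset()) <= pts2.get(x, frozenset())
               for x in variables)


def models(state, pts):
    """state |= pts: every address held by a variable is predicted by pts."""
    return all(value[1] in pts.get(x, frozenset())
               for x, value in state.items()
               if isinstance(value, tuple) and value[0] == "addr")
```

Variables holding integers impose no constraint in models, matching the implication in item 3 of the definition.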



The judgement of an expression has the form $e : pts \to A$. The intended
meaning of this judgement, which is formalized in Lemma 1, is that $A$ is the
collection of addresses that $e$ may evaluate to in a state of type $pts$. The judgement
of a statement has the form $S : pts \to pts'$. This judgement simply guarantees
that if $S$ is executed in a state of type $pts$ and the execution terminates in a
state $\gamma'$, then $\gamma'$ has type $pts'$. Typically the pointer analysis for a program $S$ is
achieved via a post-type derivation for the bottom type (mapping variables to
$\emptyset$) as the pre-type.

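The inference rules of our type system for pointer analysis are the following:

\[
\frac{}{n : pts \to \emptyset}
\qquad
\frac{}{x : pts \to pts(x)}
\qquad
\frac{}{e_1 \oplus e_2 : pts \to \emptyset}
\qquad
\frac{e : pts \to A}{x := e : pts \to pts[x \mapsto A]}\ (:=^p)
\]
\[
\frac{}{x := \&y : pts \to pts[x \mapsto \{y'\}]}
\qquad
\frac{}{\textit{skip} : pts \to pts}
\]
\[
\frac{\forall z' \in pts(y).\ x := z : pts \to pts'}{x := *y : pts \to pts'}\ (:=*^p)
\qquad
\frac{\forall z' \in pts(x).\ z := e : pts \to pts'}{*x := e : pts \to pts'}\ (*:=^p)
\]
\[
\frac{S_i : pts \cup \bigcup_{j \ne i} pts_j \to pts_i}{\textit{par}\{\{S_1\}, \ldots, \{S_n\}\} : pts \to \bigcup_i pts_i}\ (par^p)
\qquad
\frac{S_1 : pts \to pts'' \quad S_2 : pts'' \to pts'}{S_1; S_2 : pts \to pts'}\ (seq^p)
\]
\[
\frac{\textit{par}\{\{\textit{if } b_1 \textit{ then } S_1 \textit{ else skip}\}, \ldots, \{\textit{if } b_n \textit{ then } S_n \textit{ else skip}\}\} : pts \to pts'}{\textit{par-if}\{(b_1, S_1), \ldots, (b_n, S_n)\} : pts \to pts'}\ (\textit{par-if}^p)
\]
\[
\frac{S : pts \cup pts' \to pts'}{\textit{par-for}\{S\} : pts \to pts'}\ (\textit{par-for}^p)
\qquad
\frac{S_t : pts \to pts' \quad S_f : pts \to pts'}{\textit{if } b \textit{ then } S_t \textit{ else } S_f : pts \to pts'}\ (if^p)
\]
\[
\frac{S_t : pts \to pts}{\textit{while } b \textit{ do } S_t : pts \to pts}\ (whl^p)
\qquad
\frac{pts'_1 \le pts_1 \quad S : pts_1 \to pts_2 \quad pts_2 \le pts'_2}{S : pts'_1 \to pts'_2}\ (csq^p)
\]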

The inference rules corresponding to assignment commands are clear. For the
rule $(par^p)$ of the join-fork command, par, one possibility is that the execution
of a thread $S_i$ starts before the execution of any other thread starts. Another
possibility is that the execution starts after executions of all other threads end. Of
course there are many other possibilities in between. Consequently, the analysis
of the thread $S_i$ must consider all such possibilities. This is reflected in the
pre-type of $S_i$ and the post-type of the par command. Clearly $pts$, in $(par^p)$, is the given
pre-type for the par command, for which the rule calculates a post-type. The
union operation in the rule $(par^p)$ makes the pre-type of threads general enough.
Similar explanations clarify the rules $(\textit{par-if}^p)$ and $(\textit{par-for}^p)$.

We note that a type invariant is required to type a while statement. Also, to
achieve the analysis for one of the par's threads we need to know the analysis
results for all other threads. However, obtaining these results requires the result
of analyzing the first thread. Therefore there is a kind of circularity in rule $(par^p)$.
Similar situations arise in rules $(\textit{par-if}^p)$ and $(\textit{par-for}^p)$. Such issues can be treated
using a fix-point algorithm. The convergence of this algorithm is guaranteed as
the rules of our type system are monotone and the set of points-to types PTS is
a complete lattice. What makes calculations actually simple is that for any given
program the lattice PTS is finite. The rule $(csq^p)$ is necessary to calculate a type
invariant.
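
The fix-point treatment of the circularity in the join-fork rule can be sketched as follows. This is a minimal illustration under assumptions that are not in the paper: each thread is abstracted by a monotone transfer function from pre-type to post-type, points-to types are dictionaries from variable names to frozensets of names, post-types are accumulated by union so that the iteration only ascends, and the names join, analyze_par, addr_of, and copy are hypothetical.

```python
def join(a, b):
    """Pointwise union of two points-to types (dicts: variable -> frozenset)."""
    return {v: a.get(v, frozenset()) | b.get(v, frozenset())
            for v in set(a) | set(b)}


def analyze_par(transfers, pts):
    """Iterate the join-fork typing rule until the thread post-types stabilise.

    `transfers[i]` is assumed to be a monotone function giving the post-type
    of thread i for a given pre-type. Returns the post-types pts_i and their
    union, which is the post-type of the whole par command.
    """
    posts = [{} for _ in transfers]           # start from the bottom type
    changed = True
    while changed:
        changed = False
        for i, transfer in enumerate(transfers):
            pre = pts
            for j, other in enumerate(posts):
                if j != i:
                    pre = join(pre, other)    # pts joined with all pts_j, j != i
            new_post = join(posts[i], transfer(pre))
            if new_post != posts[i]:
                posts[i], changed = new_post, True
    result = {}
    for p in posts:
        result = join(result, p)
    return posts, result


def addr_of(x, y):
    """Transfer function of `x := &y`: x points exactly to y."""
    return lambda pts: {**pts, x: frozenset({y})}


def copy(x, y):
    """Transfer function of `x := y`: x may point wherever y may point."""
    return lambda pts: {**pts, x: pts.get(y, frozenset())}
```

Because the set of variables of a program is finite, each points-to set can grow only finitely often, so the loop terminates for monotone transfer functions.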

Lemma 1. 1. Suppose $e : pts \to A$ and $\gamma \models pts$. Then $[\![e]\!]\gamma \in Addrs$ implies
$[\![e]\!]\gamma \in A$.

2. $pts \le pts' \iff (\forall \gamma.\ \gamma \models pts \Rightarrow \gamma \models pts')$.

Proof. The first item is obvious. The left-to-right direction of (2) is easy. The other
direction is proved as follows. Suppose $y' \in pts(x)$. Then the state $\{(x, y'), (t, 0) \mid
t \in Var \setminus \{x\}\}$ is of type $pts$ and hence of type $pts'$, implying that $y' \in pts'(x)$.
Therefore $pts(x) \subseteq pts'(x)$. Since $x$ is arbitrary, $pts \le pts'$.

Theorem 1. (Soundness) Suppose that $S : pts \to pts'$, $S : \gamma \leadsto \gamma'$, and $\gamma \models pts$.
Then $\gamma' \models pts'$.

Proof. The proof is by structural induction on the type derivation. We
demonstrate some cases.

- The case of $(:=^p)$: In this case $pts' = pts[x \mapsto A]$ and $\gamma' = \gamma[x \mapsto [\![e]\!]\gamma]$.
Therefore by the previous lemma $\gamma \models pts$ implies $\gamma' \models pts'$.

- The case of $(*:=^p)$: In this case there exists $z \in Var$ such that $\gamma(x) = z'$ and
$z := e : \gamma \leadsto \gamma'$. Because $\gamma \models pts$, $z' \in pts(x)$ and hence by assumption
$z := e : pts \to pts'$. Therefore by soundness of $(:=^p)$, $\gamma' \models pts'$.

- The case of $(par^p)$: In this case there exist a permutation $\theta : \{1, \ldots, n\} \to
\{1, \ldots, n\}$ and $n + 1$ states $\gamma = \gamma_1, \ldots, \gamma_{n+1} = \gamma'$ such that for every
$1 \le i \le n$, $S_{\theta(i)} : \gamma_i \leadsto \gamma_{i+1}$. Also $\gamma_1 \models pts$ implies $\gamma_1 \models pts \cup \bigcup_{j \ne \theta(1)} pts_j$.
Therefore by the induction hypothesis $\gamma_2 \models pts_{\theta(1)}$. This implies $\gamma_2 \models
pts \cup \bigcup_{j \ne \theta(2)} pts_j$. Again by the induction hypothesis we get $\gamma_3 \models pts_{\theta(2)}$.
Therefore by a simple induction on $n$, we can show that $\gamma' = \gamma_{n+1} \models pts_{\theta(n)}$,
which implies $\gamma' \models pts' = \bigcup_j pts_j$.

- The case of $(\textit{par-for}^p)$: In this case there exists $n$ such that
$\textit{par}\{\underbrace{\{S\}, \ldots, \{S\}}_{n\text{-times}}\} : \gamma \leadsto \gamma'$.
By induction hypothesis we have $S : pts \cup pts' \to pts'$. By $(par^p)$ we
conclude that $\textit{par}\{\underbrace{\{S\}, \ldots, \{S\}}_{n\text{-times}}\} : pts \to pts'$. Therefore by the soundness of
$(par^p)$, $\gamma' \models pts'$.



4 Live-variables analysis 

In this section, we present a type system to perform live-variables analysis for 
pointer programs with structured parallel constructs. We start with defining 
live-variables: 

Definition 3. - A variable is usefully used if it is used

• as the operand of the unary operation *,

• in an assignment to a variable that is live at the end of the assignment, or

• in the guard of an if-statement or a while-statement.

- A variable is live at a program point if there is a computational path from that
program point during which the variable gets usefully used before being modified.

Definition 4. The set of live types is denoted by $L$ and equal to $PTS \times \mathcal{P}(Var)$. The
second component of a live type is termed a live-component. The subtyping relation $\le$
is defined as: $(pts, l) \le (pts', l') \stackrel{\text{def}}{\iff} pts \le pts'$ and $l \supseteq l'$.

The live-variables analysis is a backward analysis. For each program point, 
this analysis specifies the set of variables that may be live (according to the 
definition above) at that point. 

Our type system for live-variables analysis is obtained as an enrichment of 
the type system for pointer analysis, presented in the previous section. Hence 
one can say that the type system presented here is a strict extension of that 
presented above. This is so because the result of pointer analysis is necessary to 
improve the precision of the live-variables analysis. This also gives an intuitive 
explanation of the definition of live types above. 

The judgement of a statement $S$ has the form $S : (pts, l) \to (pts', l')$. The
intuition of the judgement is that the presence of live-variables at the post-state
of an execution of $S$ in $l'$ implies the presence of live-variables at the pre-state of
this execution in $l$. The intuition agrees with the fact that live-variables analysis
is a backward analysis and gives an insight into the definition of $\gamma \models l$ below.

Suppose we have the set of variables $l'$ whose values interest us
at the end of executing a statement $S$ and the result of pointer analysis of $S$ (in
the form $S : pts \to pts'$). The live-variables analysis takes the form of a pre-type
derivation that calculates a set $l$ such that $S : (pts, l) \to (pts', l')$. The idea here
is that by proceeding from the last point of the program, calculating pre-types,
we achieve the backward analysis.

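The inference rules for our type system for live-variables analysis are as
follows.

\[
\frac{x := e : pts \to pts' \quad x \notin l'}{x := e : (pts, l') \to (pts', l')}\ (:=^l_1)
\qquad
\frac{x := e : pts \to pts' \quad x \in l'}{x := e : (pts, (l' \setminus \{x\}) \cup FV(e)) \to (pts', l')}\ (:=^l_2)
\]
\[
\frac{}{x := \&y : (pts, l' \setminus \{x\}) \to (pts[x \mapsto \{y'\}], l')}\ (:=\&^l)
\qquad
\frac{}{\textit{skip} : (pts, l) \to (pts, l)}
\]
\[
\frac{x := *y : pts \to pts' \quad x \notin l'}{x := *y : (pts, l' \cup \{y\}) \to (pts', l')}\ (:=*^l_1)
\]
\[
\frac{x := *y : pts \to pts' \quad x \in l'}{x := *y : (pts, (l' \setminus \{x\}) \cup \{y, z \mid z' \in pts(y)\}) \to (pts', l')}\ (:=*^l_2)
\]
\[
\frac{*x := e : pts \to pts' \quad pts(x) \cap l' = \emptyset}{*x := e : (pts, l' \cup \{x\}) \to (pts', l')}\ (*:=^l_1)
\qquad
\frac{*x := e : pts \to pts' \quad pts(x) \cap l' \ne \emptyset}{*x := e : (pts, l' \cup FV(e) \cup \{x\}) \to (pts', l')}\ (*:=^l_2)
\]
\[
\frac{S_i : (pts \cup \bigcup_{j \ne i} pts_j, l_i) \to (pts_i, l' \cup \bigcup_{j \ne i} l_j)}{\textit{par}\{\{S_1\}, \ldots, \{S_n\}\} : (pts, \bigcup_i l_i) \to (\bigcup_i pts_i, l')}\ (par^l)
\]
\[
\frac{\textit{par}\{\{\textit{if } b_1 \textit{ then } S_1 \textit{ else skip}\}, \ldots, \{\textit{if } b_n \textit{ then } S_n \textit{ else skip}\}\} : (pts, l) \to (pts', l')}{\textit{par-if}\{(b_1, S_1), \ldots, (b_n, S_n)\} : (pts, l) \to (pts', l')}\ (\textit{par-if}^l)
\]
\[
\frac{S : (pts \cup pts', l) \to (pts', l' \cup l)}{\textit{par-for}\{S\} : (pts, l) \to (pts', l')}\ (\textit{par-for}^l)
\qquad
\frac{S_1 : (pts, l) \to (pts'', l'') \quad S_2 : (pts'', l'') \to (pts', l')}{S_1; S_2 : (pts, l) \to (pts', l')}\ (seq^l)
\]
\[
\frac{S_t, S_f : (pts, l) \to (pts', l')}{\textit{if } b \textit{ then } S_t \textit{ else } S_f : (pts, l \cup FV(b)) \to (pts', l')}\ (if^l)
\qquad
\frac{l = l' \cup FV(b) \quad S_t : (pts, l') \to (pts, l)}{\textit{while } b \textit{ do } S_t : (pts, l) \to (pts, l')}\ (whl^l)
\]
\[
\frac{(pts'_1, l'_1) \le (pts_1, l_1) \quad S : (pts_1, l_1) \to (pts_2, l_2) \quad (pts_2, l_2) \le (pts'_2, l'_2)}{S : (pts'_1, l'_1) \to (pts'_2, l'_2)}\ (csq^l)
\]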



For the command $*x := e$, we have two rules, namely $(*:=^l_1)$ and $(*:=^l_2)$. In
both cases, calculating the pre-type from the post-type includes adding $x$ to the
post-type. This is so because according to Definition 3, $x$ is live at the pre-state of
any execution of the command. The rule $(*:=^l_1)$ deals with the case that there is
no possibility that the variable modified by this statement is live ($pts(x) \cap l' = \emptyset$)
at the end of an execution. In this case there is no need to add any other variables
to the post-type. The rule $(*:=^l_2)$ deals with the case that there is a possibility
that the variable modified by this statement is live ($pts(x) \cap l' \ne \emptyset$) at the end of
an execution. In this case, there is a possibility that free variables of $e$ are used
usefully according to Definition 3. Therefore free variables of $e$ are added to
the post-type. This gives an intuitive explanation for rules of all the assignment
commands. The intuition given in the previous section for the rule $(par^p)$ helps
to understand the rules for the parallel constructs, $(par^l)$, $(\textit{par-if}^l)$, and $(\textit{par-for}^l)$.
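
The backward calculation of pre-types for the assignment commands can be sketched directly from the case analysis described above. The Python below is an illustration only: live sets are Python sets of variable names, an address z' is identified with the variable name z, the free variables of e are passed in as a set, and the function names pre_assign, pre_load, and pre_store are hypothetical.

```python
def pre_assign(x, fv_e, live_post):
    """Live set before `x := e`, given the live set after it."""
    if x not in live_post:
        return set(live_post)                 # x is not live afterwards
    return (live_post - {x}) | fv_e           # x is live: e is usefully used


def pre_load(x, y, pts, live_post):
    """Live set before `x := *y`; `pts` maps variables to pointed-to names."""
    if x not in live_post:
        return live_post | {y}                # y is dereferenced in any case
    return (live_post - {x}) | {y} | pts.get(y, set())


def pre_store(x, fv_e, pts, live_post):
    """Live set before `*x := e`."""
    if not (pts.get(x, set()) & live_post):
        return live_post | {x}                # no possibly-live target
    return live_post | fv_e | {x}             # some target may be live
```

Note that pre_store never removes a variable from the live set, since a store through a pointer does not definitely overwrite any particular variable.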

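Towards proving the soundness of our type system for live-variables
analysis, we introduce necessary definitions and results.

Definition 5. 1. $\gamma \models_l pts \stackrel{\text{def}}{\iff} \forall x \in l.\ \gamma(x) \in Addrs \Rightarrow \gamma(x) \in pts(x)$.
2. $\gamma \sim_l \gamma' \stackrel{\text{def}}{\iff} \forall x \in l.\ \gamma(x) = \gamma'(x)$.
3. $\gamma \sim_{(pts, l)} \gamma' \stackrel{\text{def}}{\iff} \gamma \models_l pts$, $\gamma' \models_l pts$, and $\gamma \sim_l \gamma'$.

Definition 6. The expression $\gamma \models l$ denotes the case when there is a variable that is live
at that state (computational point) and is not included in $l$. A state $\gamma$ has type $(pts, l)$,
denoted by $\gamma \models (pts, l)$, if $\gamma \models_l pts$ and $\gamma \models l$.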

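The following lemma is proved by structural induction on $e$ and $b$.

Lemma 2. Suppose that $\gamma$ and $\gamma'$ are states and $l, l' \in \mathcal{P}(Var)$. Then

1. If $l \supseteq l'$ and $\gamma \sim_l \gamma'$, then $\gamma \sim_{l'} \gamma'$.

2. If $l = l' \cup FV(e)$ and $\gamma \sim_l \gamma'$, then $[\![e]\!]\gamma = [\![e]\!]\gamma'$ and $\gamma \sim_{l'} \gamma'$.

3. If $l = l' \cup FV(b)$ and $\gamma \sim_l \gamma'$, then $[\![b]\!]\gamma = [\![b]\!]\gamma'$ and $\gamma \sim_{l'} \gamma'$.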



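The following lemma follows from Lemma 1.

Lemma 3. Suppose that $\gamma \models_l pts$, $FV(e) \subseteq l$, and $e : pts \to A$. Then

$[\![e]\!]\gamma \in Addrs \Rightarrow [\![e]\!]\gamma \in A$.

Proof. Consider the state $\gamma'$, where $\gamma' = \lambda x.\ \textit{if } x \in FV(e) \textit{ then } \gamma(x) \textit{ else } 0$. It is not
hard to see that $[\![e]\!]\gamma = [\![e]\!]\gamma'$ and $\gamma' \models pts$. Now by Lemma 1, $[\![e]\!]\gamma' \in Addrs$
implies $[\![e]\!]\gamma' \in A$, which completes the proof.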

Theorem 2. 1. $(pts, l) \le (pts', l') \Rightarrow (\forall \gamma.\ \gamma \models_l pts \Rightarrow \gamma \models_{l'} pts')$.

2. Suppose that $S : (pts, l) \to (pts', l')$ and $S : \gamma \leadsto \gamma'$. Then $\gamma \models_l pts$ implies
$\gamma' \models_{l'} pts'$.

3. Suppose that $S : (pts, l) \to (pts', l')$ and $S : \gamma \leadsto \gamma'$. Then $\gamma \models l$ implies $\gamma' \models l'$.
This guarantees that if the set of variables live at $\gamma'$ is included in $l'$, then the set of
variables live at $\gamma$ is included in $l$.

Proof. 1. Suppose $\gamma \models_l pts$. This implies $\gamma \models_{l'} pts$ because $l' \subseteq l$. The last fact
implies $\gamma \models_{l'} pts'$ because $pts \le pts'$.

2. The proof is by induction on the structure of type derivation. We show some
cases.

(a) The type derivation has the form $(:=^l_1)$. In this case, $pts' = pts[x \mapsto A]$ and
$\gamma' = \gamma[x \mapsto [\![e]\!]\gamma]$. Therefore $\gamma \models_{l'} pts$ implies $\gamma' \models_{l'} pts'$ because $x \notin l'$.

(b) The type derivation has the form $(:=^l_2)$. In this case, $e : pts \to A$, $pts' =
pts[x \mapsto A]$, $\gamma' = \gamma[x \mapsto [\![e]\!]\gamma]$, and $l = (l' \setminus \{x\}) \cup FV(e)$. Therefore by
Lemma 3 it is not hard to see $\gamma' \models_{l'} pts'$.

(c) The type derivation has the form $(:=*^l_1)$. In this case, for every $z' \in pts(y)$,
we have $x := z : pts \to pts'$, $\gamma(y) = z'$, and $x := z : \gamma \leadsto \gamma'$. We have
$z' \in pts(y)$, because $y \in l$ and $\gamma \models_l pts$. Therefore by $(:=^l_1)$, we have
$x := z : (pts, l') \to (pts', l')$. Now $\gamma \models_l pts$ amounts to $\gamma \models_{l'} pts$. Hence we
get $\gamma' \models_{l'} pts'$ by soundness of $(:=^l_1)$.

(d) The type derivation has the form $(:=*^l_2)$. In this case, for every $z' \in
pts(y)$, we have $x := z : pts \to pts'$, $\gamma(y) = z'$, $x := z : \gamma \leadsto \gamma'$, and
$l = (l' \setminus \{x\}) \cup \{y, z \mid z' \in pts(y)\}$. We have $z' \in pts(y)$ because $\gamma \models_l pts$ and
$y \in l$. Therefore by $(:=^l_2)$ we have $x := z : (pts, (l' \setminus \{x\}) \cup \{z\}) \to (pts', l')$.
$\gamma \models_l pts$ implies $\gamma \models_{(l' \setminus \{x\}) \cup \{z\}} pts$. Hence by soundness of $(:=^l_2)$, we get
$\gamma' \models_{l'} pts'$.

(e) The type derivation has the form $(*:=^l_1)$. In this case, for every $z' \in pts(x)$,
we have $z := e : pts \to pts'$, $\gamma(x) = z'$, and $z := e : \gamma \leadsto \gamma'$. We have
$z' \in pts(x)$, because $x \in l$ and $\gamma \models_l pts$. Therefore by $(:=^l_1)$, we have
$z := e : (pts, l') \to (pts', l')$ because $pts(x) \cap l' = \emptyset$. Now $\gamma \models_l pts$ amounts
to $\gamma \models_{l'} pts$. Hence we get $\gamma' \models_{l'} pts'$ because $z := e : (pts, l') \to (pts', l')$
and by soundness of $(:=^l_1)$.

(f) The type derivation has the form $(*:=^l_2)$. In this case, for every $z' \in
pts(x)$, we have $z := e : pts \to pts'$, $\gamma(x) = z'$, $z := e : \gamma \leadsto \gamma'$, and
$l = l' \cup FV(e) \cup \{x\}$. We have $z' \in pts(x)$ because $\gamma \models_l pts$ and $x \in l$.
Therefore by $(:=^l_2)$ we have $z := e : (pts, (l' \setminus \{z\}) \cup FV(e)) \to (pts', l')$.
$\gamma \models_l pts$ implies $\gamma \models_{(l' \setminus \{z\}) \cup FV(e)} pts$. Hence by soundness of $(:=^l_2)$, we get
$\gamma' \models_{l'} pts'$.

(g) The type derivation has the form $(par^l)$. In this case there exist a
permutation $\theta : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and $n + 1$ states $\gamma = \gamma_1, \ldots, \gamma_{n+1} = \gamma'$
such that for every $1 \le i \le n$, $S_{\theta(i)} : \gamma_i \leadsto \gamma_{i+1}$. Also $\gamma_1 \models_l pts$
implies $\gamma_1 \models_{l_{\theta(1)}} pts \cup \bigcup_{j \ne \theta(1)} pts_j$. Therefore by the induction hypothesis
$\gamma_2 \models_{l' \cup \bigcup_{j \ne \theta(1)} l_j} pts_{\theta(1)}$. This implies $\gamma_2 \models_{l_{\theta(2)}} pts \cup \bigcup_{j \ne \theta(2)} pts_j$. Again by the
induction hypothesis we get $\gamma_3 \models_{l' \cup \bigcup_{j \ne \theta(2)} l_j} pts_{\theta(2)}$. Therefore by a
simple induction on $n$, we can show that $\gamma' = \gamma_{n+1} \models_{l' \cup \bigcup_{j \ne \theta(n)} l_j} pts_{\theta(n)}$, which
implies $\gamma' \models_{l'} pts' = \bigcup_j pts_j$.

(h) The type derivation has the form $(\textit{par-for}^l)$. In this case there exists $n$
such that $\textit{par}\{\underbrace{\{S\}, \ldots, \{S\}}_{n\text{-times}}\} : \gamma \leadsto \gamma'$. By induction hypothesis we have
$S : (pts \cup pts', l) \to (pts', l \cup l')$. By $(par^l)$ we conclude that
$\textit{par}\{\underbrace{\{S\}, \ldots, \{S\}}_{n\text{-times}}\} : (pts, l) \to (pts', l')$. Therefore by soundness of $(par^l)$, we get $\gamma' \models_{l'} pts'$.

3. The proof is also by induction on the structure of type derivation and it is
straightforward.

The proof of the following corollary follows from Theorem 2.

Corollary 1. Suppose $S : \gamma \leadsto \gamma'$ and $S : (pts, l) \to (pts', l')$. Then $\gamma \models (pts, l)$
implies $\gamma' \models (pts', l')$.

Theorem 3. Suppose that $S : (pts, l) \to (pts', l')$, $S : \gamma \leadsto \gamma'$, $\gamma \sim_{(pts, l)} \gamma_*$, and $S$ does
not abort at $\gamma_*$. Then there exists a state $\gamma'_*$ such that $S : \gamma_* \leadsto \gamma'_*$ and $\gamma' \sim_{(pts', l')} \gamma'_*$.

Proof. The proof is by induction on the structure of type derivation. We demonstrate
some cases:

1. The type derivation has one of the forms $(:=^l_1)$ and $(:=^l_2)$. In this case, $pts' =
pts[x \mapsto A]$ and $\gamma' = \gamma[x \mapsto [\![e]\!]\gamma]$. We take $\gamma'_* = \gamma_*[x \mapsto [\![e]\!]\gamma_*]$.

2. The type derivation has the form $(:=*^l_1)$ or $(:=*^l_2)$. In this case, $\forall z' \in pts(y)$,
we have $x := z : pts \to pts'$, $\gamma(y) = z'$, and $x := z : \gamma \leadsto \gamma'$. We set $\gamma'_* =
\gamma_*[x \mapsto \gamma_*(z)]$.

3. The type derivation has one of the forms $(*:=^l_1)$ and $(*:=^l_2)$. In this case,
$\forall z' \in pts(x)$, we have $z := e : pts \to pts'$, $\gamma(x) = z'$, and $z := e : \gamma \leadsto \gamma'$. We let
$\gamma'_* = \gamma_*[z \mapsto [\![e]\!]\gamma_*]$.

4. The type derivation has the form $(par^l)$. In this case there exist a permutation
$\theta : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and $n + 1$ states $\gamma = \gamma_1, \ldots, \gamma_{n+1} = \gamma'$ such that for
every $1 \le i \le n$, $S_{\theta(i)} : \gamma_i \leadsto \gamma_{i+1}$. We refer to $\gamma_*$ as $\gamma_{*1}$. We have $\gamma_1 \sim_{(pts, \bigcup_j l_j)} \gamma_{*1}$,
which implies $\gamma_1 \sim_{(pts \cup \bigcup_{j \ne \theta(1)} pts_j,\ l_{\theta(1)})} \gamma_{*1}$. Therefore by induction hypothesis,
there exists $\gamma_{*2}$ such that $S_{\theta(1)} : \gamma_{*1} \leadsto \gamma_{*2}$ and $\gamma_2 \sim_{(pts_{\theta(1)},\ l' \cup \bigcup_{j \ne \theta(1)} l_j)} \gamma_{*2}$, which
implies $\gamma_2 \sim_{(pts \cup \bigcup_{j \ne \theta(2)} pts_j,\ l_{\theta(2)})} \gamma_{*2}$. Therefore a simple induction on $n$ proves the
required.



5 Dead code elimination 



This section introduces a type system for dead code elimination. Given a
program and a set of variables whose values concern us at the end of the program,
there may be some code in the program that has no effect on the values of these
variables. Such code is called dead code. The type system presented here aims at
optimizing structured parallel programs with pointer constructs via
eliminating dead code. In the form of a type derivation, the type system associates each
optimization with a proof for the soundness of the optimization. Optimizing a
program may result in correcting it, i.e. preventing it from aborting. Of course
this happens only if the removed dead code is the only cause of abortion.

The type system presented here has judgements of the form $S : (pts, l) \to
(pts', l') \hookrightarrow S'$. The intuition is that $S'$ optimizes $S$ towards dead code
elimination (and possibly program correction). As mentioned earlier on several occasions,
the derivation of such a judgement provides a justification for the optimization
process. The form of the judgement makes it apparent that the type system
presented in this section is built on the type system for live-variables analysis.

Algorithm: parallel-optimize

- Input: a statement $S$ of the language presented in Section 2 and a set of variables $l'$
that we consider live (their values concern us) at the end of executing $S$;

- Output: an optimized and possibly corrected version $S'$ of $S$ such that the relation
between $S$ and $S'$ is as stated in Theorem 4.

- Method:

1. Find $pts$ such that $S : \bot \to pts$ in the type system for pointer analysis.

2. Find $l$ such that $S : (\bot, l) \to (pts, l')$ in the type system for live-variables analysis.

3. Find $S'$ such that $S : (\bot, l) \to (pts, l') \hookrightarrow S'$ in the type system for dead code
elimination.

Fig. 3. The algorithm parallel-optimize

Figure 3 outlines an algorithm, parallel-optimize, that summarizes the
optimization process. The first step of the algorithm is a pointer analysis that
annotates the points of the input program with pointer information. This step
takes the form of a post type derivation of $S$, in our type system for pointer
analysis, using the bottom points-to type
$\bot = \{x \mapsto \emptyset \mid x \in Var\}$ as the pre type.
Secondly, the algorithm refines the pointer information obtained in the first
step by annotating the pointer types with type components for live variables.
Using our type system for live-variables analysis, this is done via a pre type
derivation of $S$ with the set $l'$, the set of variables whose values concern us
at the end of execution, as the post type. Finally, the information obtained so
far is used in the third step to find $S'$ using the type system for dead code
elimination proposed in this section. Applying this algorithm to the program
on the left-hand side of Figure 1 results in the program on the right-hand side
of the same figure. The details of this application are a simple exercise.
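
As a smaller illustration than the figure, the three steps can be traced by hand
on a two-statement program. The program $S$, the choice $l' = \{x\}$, and the
names $pts_1$ and $pts_2$ are assumptions made only for this sketch. The
derivation uses the two address-of rules and the sequencing rule given later in
this section, and it assumes that the pointer type system types $x := \&y$ as
$pts \to pts[x \mapsto \{y'\}]$, which is the post type appearing in those
address-of rules.

```latex
\begin{align*}
S &= z := \&w;\; x := \&y, \qquad l' = \{x\} \\[4pt]
\text{Step 1:}\quad
  & z := \&w : \bot \to pts_1, \qquad pts_1 = \bot[z \mapsto \{w'\}] \\
  & x := \&y : pts_1 \to pts_2, \qquad pts_2 = pts_1[x \mapsto \{y'\}] \\[4pt]
\text{Steps 2 and 3:}\quad
  & x := \&y : (pts_1, \emptyset) \to (pts_2, \{x\}) \hookrightarrow x := \&y
    && \text{since } x \in \{x\} \text{ and } \{x\} \setminus \{x\} = \emptyset \\
  & z := \&w : (\bot, \emptyset) \to (pts_1, \emptyset) \hookrightarrow \mathit{skip}
    && \text{since } z \notin \emptyset \\[4pt]
\text{Hence:}\quad
  & S : (\bot, \emptyset) \to (pts_2, \{x\}) \hookrightarrow \mathit{skip};\; x := \&y
\end{align*}
```

The assignment to $z$ is dead because $z$ is neither in $l'$ nor used by the
statement that follows it, so the derivation replaces it with skip.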



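The inference rules of our type system for dead code elimination are as
follows:

$$\dfrac{x := e : pts \to pts' \qquad x \notin l'}{x := e : (pts, l') \to (pts', l') \hookrightarrow \mathit{skip}}\ (:=^e_1)$$

$$\dfrac{x := e : pts \to pts' \qquad x \in l'}{x := e : (pts, (l' \setminus \{x\}) \cup FV(e)) \to (pts', l') \hookrightarrow x := e}\ (:=^e_2)$$

$$\dfrac{}{\mathit{skip} : (pts, l) \to (pts, l) \hookrightarrow \mathit{skip}}\ (\mathit{skip}^e)$$

$$\dfrac{x \notin l'}{x := \&y : (pts, l') \to (pts[x \mapsto \{y'\}], l') \hookrightarrow \mathit{skip}}\ (:=\&^e_1)$$

$$\dfrac{x \in l'}{x := \&y : (pts, l' \setminus \{x\}) \to (pts[x \mapsto \{y'\}], l') \hookrightarrow x := \&y}\ (:=\&^e_2)$$

$$\dfrac{x := *y : pts \to pts' \qquad x \notin l'}{x := *y : (pts, l' \cup \{y\}) \to (pts', l') \hookrightarrow \mathit{skip}}\ (:=*^e_1)$$

$$\dfrac{x := *y : pts \to pts' \qquad x \in l'}{x := *y : (pts, (l' \setminus \{x\}) \cup \{y, z \mid z' \in pts(y)\}) \to (pts', l') \hookrightarrow x := *y}\ (:=*^e_2)$$

$$\dfrac{*x := e : pts \to pts' \qquad pts(x) \cap l' = \emptyset}{*x := e : (pts, l' \cup \{x\}) \to (pts', l') \hookrightarrow \mathit{skip}}\ (*:=^e_1)$$

$$\dfrac{*x := e : pts \to pts' \qquad pts(x) \cap l' \neq \emptyset}{*x := e : (pts, l' \cup \{x\} \cup FV(e)) \to (pts', l') \hookrightarrow *x := e}\ (*:=^e_2)$$

$$\dfrac{S_i : (pts \cup \bigcup_{j \neq i} pts_j,\ l_i) \to (pts_i,\ l' \cup \bigcup_{j \neq i} l_j) \hookrightarrow S_i'}{\mathit{par}\{\{S_1\}, \ldots, \{S_n\}\} : (pts, \bigcup_i l_i) \to (\bigcup_i pts_i, l') \hookrightarrow \mathit{par}\{\{S_1'\}, \ldots, \{S_n'\}\}}\ (\mathit{par}^e)$$

$$\dfrac{\begin{array}{l}\mathit{par}\{\{\mathit{if}\ b_1\ \mathit{then}\ S_1\ \mathit{else}\ \mathit{skip}\}, \ldots, \{\mathit{if}\ b_n\ \mathit{then}\ S_n\ \mathit{else}\ \mathit{skip}\}\} : (pts, l) \to (pts', l') \\ \quad \hookrightarrow \mathit{par}\{\{\mathit{if}\ b_1\ \mathit{then}\ S_1'\ \mathit{else}\ \mathit{skip}\}, \ldots, \{\mathit{if}\ b_n\ \mathit{then}\ S_n'\ \mathit{else}\ \mathit{skip}\}\}\end{array}}{\text{par-if}\{(b_1, S_1), \ldots, (b_n, S_n)\} : (pts, l) \to (pts', l') \hookrightarrow \text{par-if}\{(b_1, S_1'), \ldots, (b_n, S_n')\}}\ (\text{par-if}^e)$$

$$\dfrac{S : (pts \cup pts', l) \to (pts', l' \cup l) \hookrightarrow S'}{\text{par-for}\{S\} : (pts, l) \to (pts', l') \hookrightarrow \text{par-for}\{S'\}}\ (\text{par-for}^e)$$

$$\dfrac{S_1 : (pts, l) \to (pts'', l'') \hookrightarrow S_1' \qquad S_2 : (pts'', l'') \to (pts', l') \hookrightarrow S_2'}{S_1; S_2 : (pts, l) \to (pts', l') \hookrightarrow S_1'; S_2'}\ (\mathit{seq}^e)$$

$$\dfrac{S_t : (pts, l) \to (pts', l') \hookrightarrow S_t' \qquad S_f : (pts, l) \to (pts', l') \hookrightarrow S_f'}{\mathit{if}\ b\ \mathit{then}\ S_t\ \mathit{else}\ S_f : (pts, l \cup FV(b)) \to (pts', l') \hookrightarrow \mathit{if}\ b\ \mathit{then}\ S_t'\ \mathit{else}\ S_f'}\ (\mathit{if}^e)$$

$$\dfrac{l = l' \cup FV(b) \qquad S_t : (pts, l') \to (pts, l) \hookrightarrow S_t'}{\mathit{while}\ b\ \mathit{do}\ S_t : (pts, l) \to (pts, l') \hookrightarrow \mathit{while}\ b\ \mathit{do}\ S_t'}\ (\mathit{whl}^e)$$

$$\dfrac{(pts_1', l_1') \leq (pts_1, l_1) \qquad S : (pts_1, l_1) \to (pts_2, l_2) \hookrightarrow S' \qquad (pts_2, l_2) \leq (pts_2', l_2')}{S : (pts_1', l_1') \to (pts_2', l_2') \hookrightarrow S'}\ (\mathit{csq}^e)$$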
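
For the pointer-free part of the language, the assignment, skip, sequencing, and
conditional rules above can be read as a single backward pass over the program.
The following Python fragment is a minimal sketch of that reading and not the
implementation of the paper: the class names and the function `dce` are
hypothetical, the points-to component is dropped because no pointer statements
occur, expressions are carried as text together with their free variables, and
the two branches of a conditional are given a common pre type by taking the
union of their live sets, which assumes that the consequence rule allows a pre
live set to be enlarged.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Skip:
    pass


@dataclass(frozen=True)
class Assign:              # x := e, with FV(e) supplied explicitly
    x: str
    e: str
    fv: FrozenSet[str]


@dataclass(frozen=True)
class Seq:                 # S1; S2
    s1: "Stmt"
    s2: "Stmt"


@dataclass(frozen=True)
class If:                  # if b then St else Sf, with FV(b) supplied explicitly
    b: str
    fv: FrozenSet[str]
    st: "Stmt"
    sf: "Stmt"


Stmt = Union[Skip, Assign, Seq, If]


def dce(s: Stmt, live_post: FrozenSet[str]) -> Tuple[FrozenSet[str], Stmt]:
    """Return (l, S') for the post live set l' = live_post (pointer-free sketch)."""
    if isinstance(s, Skip):
        return live_post, s
    if isinstance(s, Assign):
        if s.x not in live_post:                   # dead assignment: becomes skip
            return live_post, Skip()
        return (live_post - {s.x}) | s.fv, s       # live assignment: kept
    if isinstance(s, Seq):
        mid, s2 = dce(s.s2, live_post)             # type S2 first (backward pass)
        pre, s1 = dce(s.s1, mid)
        return pre, Seq(s1, s2)
    if isinstance(s, If):
        lt, st = dce(s.st, live_post)
        lf, sf = dce(s.sf, live_post)
        # Common branch pre type by union (assumed use of the consequence rule).
        return lt | lf | s.fv, If(s.b, s.fv, st, sf)
    raise TypeError(f"unsupported statement: {s!r}")


# y := 1; x := 2; if x > 0 then z := x else z := 3, with l' = {z}
prog = Seq(Assign("y", "1", frozenset()),
           Seq(Assign("x", "2", frozenset()),
               If("x > 0", frozenset({"x"}),
                  Assign("z", "x", frozenset({"x"})),
                  Assign("z", "3", frozenset()))))
live_pre, optimized = dce(prog, frozenset({"z"}))
```

In this example only the assignment to `y` is replaced by `Skip()`, since `y`
is never read and is not in the post live set; the loop and the parallel
constructs are left out of the sketch.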



When optimizing programs it is important to guarantee that if (a) the original
and optimized programs are executed in similar states, and (b) the original
program ends at a state (rather than aborting), then (c) the optimized program
does not abort either, and (d) the optimized program reaches a state similar to
that reached by the original program. Indeed, this is guaranteed by the
following theorem.

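Theorem 4. (Soundness) Suppose that $S : (pts, l) \to (pts', l') \hookrightarrow S'$
and $\gamma \sim_{(pts, l)} \gamma_*$. Then

1. If $S : \gamma \rightsquigarrow \gamma'$, then there exists a state $\gamma_*'$ such that
$S' : \gamma_* \rightsquigarrow \gamma_*'$ and $\gamma' \sim_{(pts', l')} \gamma_*'$.

2. If $S' : \gamma_* \rightsquigarrow \gamma_*'$ and $S$ does not abort at $\gamma$, then there exists
a state $\gamma'$ such that $S : \gamma \rightsquigarrow \gamma'$ and $\gamma' \sim_{(pts', l')} \gamma_*'$.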

The proof of this theorem is by induction on the structure of the type
derivation, and it follows smoothly from Theorem 3. More precisely, Theorem 3 is
used when $S' = S$. When $S' = \mathit{skip}$, we take $\gamma_*' = \gamma_*$ in 1.
We note that the requirement of Theorem 3 that $S$ does not abort at $\gamma$ is
guaranteed when this theorem is called in the proof of Theorem 4.
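
The statement of the soundness theorem can be exercised on concrete runs. The
sketch below is only an illustration under stated assumptions: the tuple
encoding of statements, the functions `run` and `similar`, and the two example
programs are hypothetical; similarity of states is assumed to mean agreement on
the live variables; and dereferencing a variable that does not hold the address
of an initialized variable is assumed to abort. The optimized programs were
obtained by hand from the dereference and assignment rules of this section.

```python
from typing import Dict, List, Optional, Set, Tuple, Union

Value = Union[int, Tuple[str, str]]      # an integer, or ("addr", variable name)
State = Dict[str, Value]


def run(prog: List[tuple], state: State) -> Optional[State]:
    """Execute a straight-line program; return the final state or None on abort."""
    st = dict(state)
    for stmt in prog:
        op = stmt[0]
        if op == "skip":
            continue
        if op == "const":                # ("const", x, n) encodes x := n
            _, x, n = stmt
            st[x] = n
        elif op == "addr":               # ("addr", x, y) encodes x := &y
            _, x, y = stmt
            st[x] = ("addr", y)
        elif op == "load":               # ("load", x, y) encodes x := *y
            _, x, y = stmt
            v = st.get(y)
            if not (isinstance(v, tuple) and v[0] == "addr") or v[1] not in st:
                return None              # assumed abort: y holds no usable address
            st[x] = st[v[1]]
        else:
            raise ValueError(f"unknown statement: {stmt!r}")
    return st


def similar(g1: State, g2: State, live: Set[str]) -> bool:
    """Assumed reading of state similarity: agreement on the live variables."""
    return all(g1.get(v) == g2.get(v) for v in live)


live = {"x"}

# a := 5; d := 1; p := &a; x := *p   (d := 1 is dead for l' = {x})
s1 = [("const", "a", 5), ("const", "d", 1), ("addr", "p", "a"), ("load", "x", "p")]
s1_opt = [("const", "a", 5), ("skip",), ("addr", "p", "a"), ("load", "x", "p")]

# y := 0; d := *y; x := 7   (d := *y is dead but aborts, since y holds no address)
s2 = [("const", "y", 0), ("load", "d", "y"), ("const", "x", 7)]
s2_opt = [("const", "y", 0), ("skip",), ("const", "x", 7)]

out1, out1_opt = run(s1, {}), run(s1_opt, {})
out2, out2_opt = run(s2, {}), run(s2_opt, {})
```

The first pair illustrates item 1: both programs terminate in states that agree
on `x`. The second pair illustrates why item 2 requires that $S$ does not
abort: the optimized program terminates while the original aborts, which is the
program correction effect mentioned at the beginning of this section.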

6 Related work 

Analysis of multithreaded programs: The analysis of multithreaded programs 
is an area that receives growing interest. It is a challenging area [5] as the 
presence of threading complicates the program analysis. The work in this area 
can be classified into two main categories. One category includes techniques that 
were designed specifically to optimize or correct multithreaded programs. The 
other category includes techniques whose scope was extended from sequential 
programs to multithreaded programs. 

The first category mentioned above covers several directions of research:
synchronization analysis, deadlock, data race, and memory consistency. The
purpose of analyzing synchronization constructs [6, 7] is to clarify how
synchronization actions separate the executions of program segments. The result
of this analysis can be used by a compiler to conveniently add join-fork
constructs. Clearly, adding such join-fork constructs will reduce the run time
of the program. One problem of multithreaded computing is deadlock, which
results from circular waiting for resources. Researchers have developed various
techniques for deadlock detection [8, 9, 10]. The situation in which a memory
location is accessed by two threads (one of them writes to the location) without
synchronization is called a data race. One direction of research in this
category focuses on data race detection [11]. The analysis of multithreaded
programs becomes even harder in the presence of a weak memory consistency model
because such a model does not guarantee that a write statement included in one
thread is observed by other threads in the same order. However, such a model
simplifies some issues at the hardware level. The work in this direction, like
[12], aims at overcoming the drawbacks of using a simple memory consistency
model. Asymmetric Distributed Shared Memory (ADSM), a programming model serving
heterogeneous computing, is introduced in [12]. ADSM manages a shared virtual
memory to enable CPU access to addresses in the accelerator's physical memory.



Under the second category mentioned above come several directions of research.
One such direction is the use of flow-insensitive analysis techniques to
analyze multithreaded programs [13, 14]. Although flow-insensitive techniques
are not very precise, some applications can afford that. Examples of program
analyses whose techniques were extended to cover multithreaded programs are
code motion [15], constant propagation [16], data flow for multithreaded
programs with copy-in and copy-out memory semantics [17, 18], and concurrent
static single assignment form [19].

The problem with almost all the work referred to above is that it does not
apply to pointer programs. More precisely, for some of the work the application
is possible only if we have the result of a pointer analysis for the input
pointer program. The technique presented in this paper for optimizing
multithreaded programs has the advantage of being simpler and more reliable than
the optimization techniques referred to above that would work in the presence of
a pointer analysis.

Pointer analysis: Pointer analysis for sequential programs has been studied
extensively for decades [20]. One way to classify the work in this area is
according to the properties of flow-sensitivity and context-sensitivity. Hence
an analysis is classified as flow-sensitive or flow-insensitive, and as
context-sensitive or context-insensitive.

Flow-sensitive analyses [21, 22, 23], which are more natural than
flow-insensitive ones for most applications, consider the order of program
commands. Mostly these analyses perform an abstract interpretation of the
program using dataflow analysis to associate each program point with a points-to
relation. Flow-insensitive analyses [24, 25] do not consider the order of
program commands. Typically the output of these analyses, which are performed
using a constraint-based approach, is a points-to relation that is valid all
over the program. Clearly the flow-sensitive approach is more precise but less
efficient than the flow-insensitive one. Moreover, flow-insensitive techniques
can be used to analyze multithreaded programs.
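
The difference in precision between the two approaches can be seen on a program
that assigns two addresses to the same pointer in turn. The sketch below is a
toy illustration and not one of the cited analyses: it assumes a language with
address-of assignments only, and the functions `flow_sensitive` and
`flow_insensitive` are hypothetical names introduced here.

```python
from typing import Dict, List, Set, Tuple

Stmt = Tuple[str, str]                   # (p, a) encodes the statement p := &a


def flow_sensitive(prog: List[Stmt]) -> List[Dict[str, Set[str]]]:
    """One points-to map per program point; each assignment overwrites p."""
    pts: Dict[str, Set[str]] = {}
    points = []
    for p, a in prog:
        pts = {**pts, p: {a}}            # strong update of p at this point
        points.append(pts)
    return points


def flow_insensitive(prog: List[Stmt]) -> Dict[str, Set[str]]:
    """A single points-to map valid at every point; statement order is ignored."""
    pts: Dict[str, Set[str]] = {}
    for p, a in prog:
        pts.setdefault(p, set()).add(a)  # accumulate every possible target
    return pts


prog = [("p", "a"), ("p", "b")]          # p := &a; p := &b
fs = flow_sensitive(prog)
fi = flow_insensitive(prog)
```

After the second statement the flow-sensitive result records that `p` points
only to `b`, whereas the flow-insensitive result keeps both `a` and `b` as
possible targets at every program point.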

The idea of the context-sensitive approach [26, 23] is to produce a points-to
relation for the context of each call site of each procedure. On the other hand,
context-insensitive [27] pointer analysis produces one points-to relation for
each procedure to cover the contexts of all call sites. As expected, the
context-sensitive approach is more precise but less efficient than the
context-insensitive one.

Although the problem of pointer analysis for sequential programs has been
studied extensively, little effort has been made towards a pointer analysis for
multithreaded programs. In [28], a flow-sensitive analysis for multithreaded
programs was introduced. This analysis associates each program point with a
triple of points-to relations. This in turn complicates the analysis and creates
a sort of redundancy in the collected points-to information. Investigating the
details of this approach and our work makes it apparent that our work is simpler
and more accurate than this approach. Moreover, our approach provides a proof
for the correctness of the pointer analysis for each program. To the best of
our knowledge, such a proof is not known to be provided by any other existing
approach.

Type systems in program analysis: The work in [1, 2, 3, 29, 30, 31] is among the
closest work to ours in the sense that it uses type systems to achieve the
program analysis in a way similar to the present paper. The work in [32] can be
seen as a special case of our work for the case of the while language, where
there are neither threading nor pointer constructs.

The work in [2] shows that a good deal of program analysis can be done using
type systems. More precisely, it proves that for every analysis in a certain
class of data-flow analyses, there exists a type system such that a program
checks with a type if and only if the type is a supertype for the set resulting
from running the analysis on the program. The type system in [33] and the
flow-logic work in [3], which is used in [34] to study security of the
coordinated systems, are very similar to [2]. Moreover, the work in [33]
transforms logical statements about programs to statements about the program
optimizations. For the simple while language, the work in [1] introduces type
systems for constant folding and dead code elimination and also logically proves
correctness of optimizations. The bidirectional data-flow analyses and their
program optimizations are treated with type systems in [35]. Earlier, related
work (with structurally-complex type systems) is [36]. The work in [30] presents
type systems that check memory safety of multithreaded programs using
sensitive-nonsensitive pointer analysis.

To the best of our knowledge, our approach is the first attempt to use type
systems to optimize multithreaded programs and to associate every individual
optimization with a justification for correctness.
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